IEEE Journal of Biomedical and Health Informatics
● Institute of Electrical and Electronics Engineers (IEEE)
Preprints posted in the last 90 days, ranked by how well they match the content profile of IEEE Journal of Biomedical and Health Informatics, based on 34 papers previously published here. The average preprint has a 0.08% match score for this journal, so anything above that is an above-average fit.
YILDIZ, O.; Subasi, A.
Stress detection with wearable physiological sensors is vital in digital health and affective computing. Conventional machine learning techniques usually examine physiological signals separately, missing the intricate inter-signal connections involved in the human stress response. While deep neural networks offer high accuracy, they function as black boxes, offering minimal understanding of the physiological processes behind stress detection. This study introduces a hierarchical graph neural network framework for WESAD stress detection, establishing a methodology for affective computing that emphasizes interpretability and extensibility while maintaining strong predictive performance. We propose PAMG-AT (Physiological Attention Multi-Graph with Adaptive Topology), a hierarchical graph neural network architecture for stress detection using multimodal physiological signals. In this framework, physiological features serve as nodes within a knowledge-driven graph, while edges represent established physiological relationships, including cardiac-electrodermal coupling and cardio-respiratory interaction. The architecture employs a three-level attention mechanism: spatial encoding via Graph Attention Networks (GAT) to assess feature importance, temporal modeling with a Transformer to capture dynamics across time windows, and global pooling for classification. The model is evaluated using three sensor configurations (chest-only, wrist-only, and hybrid) on the WESAD dataset, employing rigorous Leave-One-Subject-Out (LOSO) cross-validation. PAMG-AT achieves competitive performance, with 94.59% accuracy (±6.8%) for chest sensors, 91.76% (±9.2%) for wrist sensors, and 92.80% (±8.33%) for the hybrid configuration. The proposed method provides interpretability via attention weights, revealing that ECG-EDA relationships (cardiac-electrodermal coupling) are most predictive of stress.
Three low-responder subjects (S2, S3, S9) with atypical physiological stress patterns demonstrate lower accuracy (81-87%), offering clinically valuable insights for personalized stress management. The effective wrist-only configuration, achieving 91.76% accuracy, supports practical deployment in consumer wearables.
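The LOSO protocol used above can be sketched generically. The toy nearest-centroid classifier below is a stand-in for PAMG-AT, not the authors' model, and the `(subject, features, label)` data layout is an illustrative assumption:

```python
# Leave-One-Subject-Out (LOSO) cross-validation sketch.
# Each fold holds out ALL windows of one subject, so the model is
# always tested on a person it has never seen during training.

def centroid(rows):
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(len(rows[0]))]

def nearest_centroid_predict(x, centroids):
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda lbl: dist(x, centroids[lbl]))

def loso_accuracy(samples):
    """samples: list of (subject_id, feature_vector, label)."""
    subjects = sorted({s for s, _, _ in samples})
    correct = total = 0
    for held_out in subjects:
        train = [(x, y) for s, x, y in samples if s != held_out]
        test = [(x, y) for s, x, y in samples if s == held_out]
        # Fit the toy classifier: one centroid per class.
        cents = {}
        for lbl in {y for _, y in train}:
            cents[lbl] = centroid([x for x, y in train if y == lbl])
        for x, y in test:
            correct += nearest_centroid_predict(x, cents) == y
            total += 1
    return correct / total
```

The per-subject accuracy spread reported above (e.g. ±6.8%) comes from exactly this kind of fold structure: each held-out subject yields one fold score.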
Zoofaghari, M.; Rahaimifard, A.; Chatterjee, S.; Balasingham, I.
Goal-oriented semantic communication has recently emerged in wireless sensor-actuator networks, emphasizing the meaning and relevance of information over raw data delivery, thereby enabling resource-efficient telecommunication. This paradigm offers significant benefits for intra-body or implantable sensor-actuator networks, including dramatic reductions in bandwidth requirements, latency, and power consumption. In this paper, we address a patch-based energy-efficient anomaly detection method for smart capsule endoscopy. We propose a deep learning-based algorithm that employs the similarity between features extracted from measured images and a reference (normal) image as the detection metric. The algorithm is evaluated using a clinical dataset of capsule-captured images, combined with a simulated intra-body channel model. The results demonstrate that even with only 60% of the transmission power (relative to a standard link design for QPSK modulation) and 65% of the light intensity, the probability of anomaly detection remains above 85%, and it gradually improves as power and illumination levels increase. This improvement translates into a potential battery life extension of over 43%. The findings highlight the potential of semantic-aware, energy-efficient intra-body devices for more sustainable and effective medical interventions.
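A minimal sketch of the similarity-based detection metric described above, assuming cosine similarity over extracted feature vectors and an arbitrary threshold of 0.9 (both are illustrative choices, not the paper's exact settings):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def is_anomalous(features, reference_features, threshold=0.9):
    """Flag a captured frame as anomalous when its feature vector
    drifts away from the reference (normal-tissue) feature vector."""
    return cosine_similarity(features, reference_features) < threshold
```

Because only a similarity decision (rather than the raw image) needs to be acted on, the transmitter can tolerate lower power and illumination, which is the semantic-communication argument the abstract makes.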
Sakurai, R.; Kojima, S.; Otake-Matsuura, M.; Kanoh, S.; Rutkowski, T. M.
Traditional psychiatric assessments for depression are often hindered by subjective bias and patient recall inaccuracy. This paper presents a multimodal passive Brain-Computer Interface (pBCI) designed for the objective screening of depressive traits through the end-to-end decoding of neural dynamics. We implemented a hybrid EEG-fNIRS framework to capture synchronized electro-hemodynamic responses during an emotional working memory (EWM) task. To classify sub-clinical depressive tendencies based on BDI-II scores, we utilized SincShallowNet, a deep learning architecture optimized for raw signal processing via learnable Sinc-filters. Our results demonstrate that the pBCI achieves peak performance in the auditory modality, with the integration of EEG and low-pass filtered fNIRS (0.15 Hz) yielding a balanced accuracy of 90.9% and an F1-score of 0.867. By isolating purely endogenous neural markers during the EWM maintenance phase, the system provides a robust "silent observer" for mental state monitoring. These findings validate the potential of multimodal pBCIs as high-precision, data-driven tools for early-stage depression screening, offering a scalable alternative to traditional clinical interviews and a foundation for longitudinal mental health monitoring.
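The "learnable Sinc-filters" in SincShallowNet are trainable band-pass kernels; a fixed (non-learnable) analogue can be written directly from the windowed-sinc band-pass formula. The cutoffs, Hamming window, and kernel length below are illustrative choices, not the network's learned parameters:

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi*x)/(pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def sinc_bandpass_kernel(f1_hz, f2_hz, fs_hz, length=101):
    """Hamming-windowed ideal band-pass FIR kernel: the difference of
    two low-pass sinc kernels with cutoffs f2 and f1. In SincNet-style
    layers, f1 and f2 are the learnable parameters."""
    assert 0 < f1_hz < f2_hz < fs_hz / 2 and length % 2 == 1
    f1, f2 = f1_hz / fs_hz, f2_hz / fs_hz  # normalized cutoffs
    m = length // 2
    kernel = []
    for i in range(length):
        n = i - m
        ideal = 2 * f2 * sinc(2 * f2 * n) - 2 * f1 * sinc(2 * f1 * n)
        hamming = 0.54 - 0.46 * math.cos(2 * math.pi * i / (length - 1))
        kernel.append(ideal * hamming)
    return kernel
```

Learning only the two cutoff frequencies per filter (instead of every tap) is what makes this parameterization compact for raw-signal EEG processing.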
He, A.; Wang, X.; Yu, J.; Wang, X.; Ge, Z.; Kong, Y.; Yang, G.; Yang, C.; Yang, C.; Cao, M.
Electroencephalography (EEG) serves as a fundamental tool in modern neurology, cognitive neuroscience, and brain-computer interfaces, but its practical application is often compromised by artifacts. Physiological artifacts are particularly intractable due to overlapping spectral features with neural signals, hindering reliable EEG interpretation. In this work, we propose Grid-based 3D Convolution-Transformer (G3DCT), an interpretable deep learning framework for EEG artifact identification. The framework embeds multi-channel EEG signals into fixed grids to leverage electrode spatial topology, employs parallel multi-branch temporal convolutions and Transformers to handle complex artifacts, and incorporates an attention module to visualize scalp activation patterns, which enhances physiological interpretability. Our evaluation on three datasets demonstrates that G3DCT outperforms existing state-of-the-art models. For challenging combined artifacts, it secures a gain of 2.8% in F1-score over the second-best model. These results demonstrate that G3DCT provides an efficient and robust solution for EEG artifact identification, which has the potential to enhance the reliability of EEG-based applications in practice.
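The grid-embedding step can be sketched as a lookup from channel names to fixed 2D coordinates. The 10-20-style layout below is hypothetical and need not match G3DCT's actual grid:

```python
# Hypothetical 10-20 montage layout: channel name -> (row, col).
GRID_POS = {
    "Fp1": (0, 1), "Fp2": (0, 3),
    "F7": (1, 0), "F3": (1, 1), "Fz": (1, 2), "F4": (1, 3), "F8": (1, 4),
    "T3": (2, 0), "C3": (2, 1), "Cz": (2, 2), "C4": (2, 3), "T4": (2, 4),
    "T5": (3, 0), "P3": (3, 1), "Pz": (3, 2), "P4": (3, 3), "T6": (3, 4),
    "O1": (4, 1), "O2": (4, 3),
}

def embed_to_grid(sample, rows=5, cols=5):
    """Place one time-sample of multi-channel EEG into a fixed 2D grid
    so that convolutions can exploit electrode spatial topology.
    sample: dict channel_name -> value; unoccupied cells stay 0."""
    grid = [[0.0] * cols for _ in range(rows)]
    for ch, value in sample.items():
        r, c = GRID_POS[ch]
        grid[r][c] = value
    return grid
```

Stacking one such grid per time step yields the 3D (time × row × col) volume that a 3D convolution can consume.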
Sammartino, L.; Necki, M.; Kounetas, J.; Batty, A.; Smith, C.; O'Neill, F.; Collins, D. J.
Despite significant advancements in diagnostic methodologies, the fundamental approach to venous blood collection has remained largely unchanged since its widespread adoption in the mid-20th century. This stagnation poses considerable challenges, particularly in scenarios involving difficult venous access (DVA). The Bloody-Easy (BE) device represents a novel, passive, low-volume blood collection system engineered to optimize phlebotomy outcomes, especially in these challenging clinical contexts. Our prospective, randomized, crossover study involving 90 healthy volunteers demonstrates that BE achieved comparable or superior sample quality while significantly reducing the volume of blood drawn per session. Furthermore, the device garnered substantial positive feedback from both patients and clinicians. BE offers a potentially cost-neutral and low-risk solution for improving blood collection efficiency and patient experience in critical care, emergency medicine, and paediatric settings, where conventional phlebotomy techniques frequently encounter limitations.
Hu, X.
Autism spectrum disorder (ASD) affects a substantial proportion of children worldwide, yet clinical assessment of symptom severity remains resource-intensive and unevenly accessible. Artificial intelligence (AI) has transformative potential to support scalable and timely severity assessment from behavioral data, but existing approaches largely treat autism as a monolithic prediction target and rely on opaque models that are difficult for clinicians to interpret or trust. Moreover, prior multimodal methods typically integrate heterogeneous behavioral signals using ad hoc fusion strategies that are weakly grounded in clinical theory. We propose a clinical theory-driven deep learning model for interpretable autism severity assessment that explicitly operationalizes established clinical constructs into model design. Drawing on autism research, we represent social construct and motor construct as distinct latent components. These components are integrated through a structured cross-modal attention mechanism guided by a learnable alignment mask that encodes soft spatial correspondence priors between visual and kinematic representations. Theory-specific blocks then aggregate aligned tokens into construct embeddings, which are fused via instance-specific theory weights, yielding transparent symptom profiles aligned with clinical reasoning. Comprehensive experiments demonstrate the state-of-the-art performance of our model over existing baselines. Ablation studies validate that performance gains arise from theory-driven design choices. Analysis of the learned theory weights reveals systematic relationships between symptom profiles and severity, providing empirical support for the multidimensional structure of autism. This work demonstrates how clinical theory can be instantiated as empirically testable architectural designs in deep learning models, advancing both predictive utility and interpretability in healthcare AI systems.
Huang, X.; Hsieh, C.; Nguyen, Q.; Renteria, M. E.; Gharahkhani, P.
Wearable-derived physiological features have been associated with disease risk, but most current studies focus on single conditions, limiting understanding of cross-disease patterns. This study adopts a trans-diagnostic approach to examine whether wearable data capture shared and condition-specific physiological signatures across multiple chronic conditions spanning physical and mental health, and then evaluates the utility of these features for disease classification. A total of 9,301 patients with at least 21 days of consecutive FitBit data from the All of Us Controlled Tier Dataset version 8 were analyzed. Disease subcohorts included cardiovascular disease (CVD), diabetes, obstructive sleep apnea (OSA), major depressive disorder (MDD), anxiety, bipolar disorder, and attention-deficit/hyperactivity disorder (ADHD), chosen based on prevalence and relevance. Logistic regression and XGBoost models were fitted for each disease subcohort versus the control cohort. We found that compared to using just baseline demographic and lifestyle features, incorporating wearable-derived features enabled improved classification performance in all subcohorts for both models, except for ADHD, where improvement was mainly observed in ROC-AUC for the logistic regression model, likely due to the smaller sample size of the ADHD subcohort. The largest performance gains were observed in MDD (increase in ROC-AUC of 0.077 for logistic regression, 0.071 for XGBoost; p < 0.001) and anxiety (increase in ROC-AUC of 0.077 for logistic regression, 0.108 for XGBoost; p < 0.001). This study provides one of the first comprehensive transdiagnostic evaluations of wearable-derived features for disease classification, highlighting their potential to enhance risk stratification in the real-world setting as a practical complement to clinical assessments and providing a foundation to explore more fine-grained wearable data.
Author summary: Wearable devices such as fitness trackers and smartwatches are becoming increasingly popular and affordable, providing continuous measurements of heart rate, physical activity, and sleep. Alongside the growing digitization of health records, this creates new opportunities for large-scale, real-world health studies. In this study, we analyzed wearable-derived physiological patterns across a range of chronic conditions spanning both physical and mental health to better understand how these signals relate to disease risk. We found that incorporating wearable-derived heart rate, activity and sleep features improved disease risk classification across several conditions, with particularly strong gains for major depressive disorder and anxiety. By examining how individual features contributed to model predictions, we also identified meaningful associations between physiological signals and disease risk. For example, both duration and day-to-day variation of deep and rapid eye movement (REM) sleep were associated with increased risk in certain conditions. Our study supports the development of real-time, automated tools to assess disease risk alongside clinical care.
Shahriar, K. A.
Parkinson's disease is a progressive neurological disorder characterized by motor impairments whose severity is commonly assessed using the Unified Parkinson's Disease Rating Scale (UPDRS). Although clinically established, UPDRS assessment requires in-person evaluation by trained specialists and is inherently subjective, limiting its suitability for frequent monitoring. Speech production is affected early in Parkinson's disease and provides a non-invasive modality for remote symptom assessment. In this study, an uncertainty-aware personalized framework is proposed for estimating Parkinson's disease severity from speech signals. The approach integrates temporal modeling of longitudinal speech recordings with patient-specific representations and a probabilistic latent disease state. Continuous motor UPDRS scores are estimated jointly with ordinal disease severity stages, enabling both fine-grained regression and clinically interpretable stratification. Predictive uncertainty is explicitly quantified, yielding confidence-aware severity estimates suitable for telemonitoring applications. The method is evaluated on a longitudinal speech dataset using a strict patient-wise split, ensuring that all test subjects are unseen during training. On the held-out test set, the proposed model achieves high predictive accuracy (mean absolute error 0.56 UPDRS points, root mean squared error 0.74, and coefficient of determination R² = 0.99) for motor UPDRS estimation. Ordinal severity classification attains an accuracy of 0.92 across mild, moderate, and severe disease stages. Comparative experiments against classical machine learning methods and global temporal baselines demonstrate consistent performance improvements. These results indicate that personalized, uncertainty-aware modeling of speech signals can support accurate and clinically meaningful remote monitoring of Parkinson's disease severity.
Chen, Z.; Wu, R.; Liu, Y.; Li, R.; Duprey, A.
The integration of Large Language Models into high-stakes clinical workflows is critically hampered by their lack of verifiable reliability and tendency to generate hallucinations. This paper introduces Med-ICE, an autonomous framework designed to enhance the reliability of LLMs for medical applications. Med-ICE adapts the Iterative Consensus Ensemble paradigm, enabling a group of peer LLM agents to collaboratively converge on a final answer through iterative rounds of generation and peer review, thereby eliminating the need for an external arbiter and its associated scalability bottleneck. Our work makes three key contributions: (1) a novel semantic consensus mechanism that determines agreement based on semantic similarity, crucial for nuanced clinical language; (2) demonstration of state-of-the-art performance, where Med-ICE significantly outperforms both direct single-LLM generation and the Self-Refinement technique on challenging medical benchmarks; and (3) a highly efficient and scalable architecture, as our Semantic Consensus Monitor is computationally lightweight. This research establishes a new standard for developing safer, more trustworthy LLM systems, paving the way for their responsible integration into medicine.
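A sketch of a pairwise consensus check in the spirit of the Semantic Consensus Monitor described above. Token-set Jaccard similarity stands in for the paper's semantic similarity measure (an embedding model would normally be used), and the 0.6 threshold is arbitrary:

```python
def jaccard(a_text, b_text):
    """Token-set overlap as a cheap stand-in for semantic similarity."""
    a, b = set(a_text.lower().split()), set(b_text.lower().split())
    return len(a & b) / len(a | b) if a | b else 1.0

def has_consensus(answers, threshold=0.6, similarity=jaccard):
    """Declare consensus when every pair of agent answers is similar
    enough: a pairwise check that needs no external arbiter model."""
    for i in range(len(answers)):
        for j in range(i + 1, len(answers)):
            if similarity(answers[i], answers[j]) < threshold:
                return False
    return True
```

In the iterative loop the abstract describes, rounds of generation and peer review would repeat until `has_consensus` holds (or a round limit is reached), at which point the agreed answer is emitted.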
Li, D.; Fu, C.-H.; Tang, K.
The human face is a rich medium for biometric, behavioral, and clinical information. However, technologies based on 2D facial images lack critical geometric details and are susceptible to pose and illumination interference, while 3D facial deep learning frameworks are hindered by complex annotation, preprocessing, and task-specific designs with poor cross-domain generalization. To address these challenges, we propose UniFacePoint-FM, a 3D facial foundation model built on a self-supervised Point-MAE framework, tailored for high-fidelity point cloud representation learning. The model was pretrained on a self-constructed dataset of high-resolution 3D facial scans, followed by supervised fine-tuning and comprehensive evaluation across three independent datasets for diverse downstream tasks. Experimental results demonstrate that UniFacePoint-FM is both pretraining-efficient and highly generalizable: it achieves state-of-the-art performance on gender classification, age regression, and BMI prediction, and matches the accuracy of the ResMLP model (while outperforming other baselines) in facial expression recognition. Notably, by learning high-quality, fine-grained representations directly from raw point clouds, UniFacePoint-FM delivers robust generalization and transferability across tasks, datasets, and even different face scanning platforms. Overall, our work establishes an effective foundation model paradigm for 3D facial analysis, with promising implications for biometric security, health monitoring, and advanced human-computer interaction systems.
Ogretir, M.; Kaipainen, V.; Leskinen, M.; Lahdesmaki, H.; Koskinen, M.
Neonates requiring intensive care are at increased risk for long-term neuropsychiatric disorders. However, clinical adoption of risk prediction models remains limited when their performance lacks adequate interpretability for informed clinical decision-making. Here, we investigated whether longitudinal neonatal electronic health record (EHR) data from the first 90 days of life can support clinically meaningful interpretation of long-term risk signals for major neuropsychiatric diagnoses by age seven. In a retrospective register-based cohort of 17,655 at-risk children from an academic medical center, of whom 8.0% (1,420) received a major neuropsychiatric diagnosis during follow-up, we applied a time-aware transformer model (Self-supervised Transformer for Time-Series; STraTS) and thoroughly evaluated its predictions using three complementary interpretability approaches: perturbation-based variable importance, value-dependent effect analysis, and leave-one-out (LOO) feature attribution. STraTS achieved the highest area under the precision-recall curve (AUPRC 0.171 ± 0.022), compared with Random Forest (0.166 ± 0.008), logistic regression (0.151 ± 0.007), and XGBoost (0.128 ± 0.010). Across interpretability methods, five predictors were consistently identified: birth weight, gender, Apgar score at 1 minute, umbilical serum thyroid stimulating hormone (uS-TSH), and treatment time in hospital. Indicators of early clinical severity, including chromosomal abnormalities and neonatal cerebral-status disturbances, showed the largest risk-increasing effects. Furthermore, the model's learned vector representations of subject-specific EHR sequences formed clinically coherent latent embeddings that reflect population heterogeneity along established perinatal risk dimensions.
These findings demonstrate that combining multiple complementary interpretability methods yields stable, clinically plausible risk signals while revealing limitations that would remain undetected by any single approach, highlighting the importance of careful interpretability analysis of deep learning-based risk predictions.
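Perturbation-based variable importance, the first of the three interpretability approaches above, can be sketched as shuffling one feature column and measuring the resulting accuracy drop. The model, data layout, and metric here are illustrative stand-ins, not STraTS:

```python
import random

def accuracy(model, X, y):
    """Fraction of samples the model classifies correctly."""
    return sum(model(x) == t for x, t in zip(X, y)) / len(y)

def perturbation_importance(model, X, y, feature_idx, seed=0):
    """Importance of one feature = accuracy drop after shuffling that
    column across samples, which breaks its relation to the label
    while preserving its marginal distribution."""
    rng = random.Random(seed)
    column = [x[feature_idx] for x in X]
    rng.shuffle(column)
    X_perturbed = [list(x) for x in X]
    for row, v in zip(X_perturbed, column):
        row[feature_idx] = v
    return accuracy(model, X, y) - accuracy(model, X_perturbed, y)
```

Features whose shuffling costs the most accuracy (birth weight, Apgar score, etc. in the study above) rank as most important; a feature the model ignores scores zero.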
Specht, B.; Tayeb, Z. Z.; Garbaya, S.; Khadraoui, D.; EL-Khozondar, M.; Schneider, R.
Accurate inference of physiological state across the menstrual cycle has important applications in reproductive health and in understanding symptom dynamics, yet most non-hormonal approaches rely on wearable sensors or calendar-based tracking. Whether self-reported symptoms alone can support prospective, cross-subject phase classification remains unresolved. Here, we introduce a hybrid modelling framework that combines a gradient-boosted classifier with a Hidden Semi-Markov Model to infer four menstrual cycle phases (menstrual, follicular, fertile, and luteal) from self-reported data. The classifier captures non-linear symptom patterns, while the temporal model imposes biologically grounded constraints, including cyclic ordering and realistic phase durations. In a leave-one-subject-out evaluation using hormonally annotated data from 41 participants, the model achieved 67.6% accuracy and a macro F1 score of 0.662. Features reflecting short-term symptom variability were more informative than absolute symptom levels, indicating that within-person fluctuation provides a more generalisable signal of cycle phase than symptom intensity alone. These findings demonstrate the feasibility of low-burden, device-free menstrual health monitoring, establish symptom dynamics as a basis for scalable digital biomarkers, and expand access to tracking in resource-constrained settings and populations underserved by wearable-based approaches.
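The temporal layer can be sketched as Viterbi decoding under a cyclic transition structure: each day the cycle either stays in its phase or advances to the next phase (mod 4). This is a plain-HMM simplification with geometric durations, not the paper's Hidden Semi-Markov Model, and the stay probability is an arbitrary stand-in for learned phase-duration constraints:

```python
import math

PHASES = ["menstrual", "follicular", "fertile", "luteal"]

def cyclic_viterbi(prob_seq, p_stay=0.8):
    """Decode the most likely phase sequence given per-day classifier
    probabilities over PHASES, while forbidding phase skips and
    backward jumps (cyclic ordering constraint)."""
    k = len(PHASES)
    log = lambda p: math.log(max(p, 1e-12))
    scores = [log(prob_seq[0][s]) for s in range(k)]
    back = []
    for probs in prob_seq[1:]:
        new, ptr = [], []
        for s in range(k):
            stay = scores[s] + log(p_stay)           # remain in phase s
            advance = scores[(s - 1) % k] + log(1 - p_stay)  # from previous phase
            if stay >= advance:
                new.append(stay + log(probs[s])); ptr.append(s)
            else:
                new.append(advance + log(probs[s])); ptr.append((s - 1) % k)
        scores, back = new, back + [ptr]
    s = max(range(k), key=lambda i: scores[i])
    path = [s]
    for ptr in reversed(back):
        s = ptr[s]
        path.append(s)
    return [PHASES[s] for s in reversed(path)]
```

A one-day noisy spike toward a non-adjacent phase gets smoothed away, because reaching it would require an implausible chain of transitions.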
Yin, Z.; Wang, S.; Moraros, J.
Background: Scalp electroencephalography (EEG) based seizure prediction plays a critical role in improving the quality of life for patients with drug-resistant epilepsy, offering the potential for real-time warnings and timely interventions. Despite its clinical significance and decades of research, the field still lacks an open benchmark with reproducible baselines and deployment-oriented event-level evaluation. Most prior work relies on the small and outdated Children's Hospital Boston (CHB-MIT) dataset and reports window-level metrics only, leaving the false-alarm burden of a real warning system underspecified. In seizure prediction, the cost of a false alarm is high, since patients may receive painful electrical stimulation to suppress seizures. Hence, false alarms per hour (FA/h) and partial AUC (pAUC) are the most deployment-relevant metrics, reflecting alarm burden and discriminability in the low-false-alarm operating region that a usable warning system can realistically tolerate. However, few studies have systematically reported such metrics. In addition, vision transformers' event-level performance under deployable FA/h constraints remains underexplored, and newer backbones such as MambaVision have yet to be evaluated under this setting. Methods: In this work, we introduce a reproducible 5-fold benchmark derived from the Temple University Hospital EEG Seizure Corpus (TUSZ) dataset, and evaluate models using a pseudo-real-time event pipeline, reporting event-level sensitivity, false alarms per hour (FA/h), and partial AUC (pAUC). All models are compared to random predictors for statistical validation. We benchmark pre-trained vision transformers (SegFormer and MambaVision) under three EEG-to-image encoding methods, including a self-proposed Temporal-Patchify encoding for SegFormer. Results: Our proposed Temporal-Patchify encoding method achieves state-of-the-art performance.
We achieved 0.61 pAUC, which is 16.2% higher than the baseline Temporal-Tile SegFormer of Parani et al. The false-alarm burden (0.40 ± 0.28 FA/h) is 44.4% lower than the Temporal-Tile SegFormer baseline while maintaining clinically usable sensitivity (60.7% ± 5.0%). We further perform statistical validation against a matched Poisson random predictor, confirming performance exceeds chance. Finally, we report end-to-end inference throughput of up to 920 windows/s, with MambaVision delivering the fastest inference, exceeding SegFormer by over 20%. Conclusions: This work bridges the gap between seizure prediction algorithms and clinically usable seizure prediction systems in real-world settings. Our findings indicate that pre-trained vision transformers, when coupled with appropriate EEG encoding methods, can achieve robust performance in low-false-alarm operating regimes, which is critical for real-world deployment. This benchmark and evaluation framework may facilitate more clinically meaningful and reproducible seizure prediction research.
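Event-level FA/h scoring can be sketched as follows. The 30-minute pre-ictal horizon and the hit/miss bookkeeping are common conventions in this literature, not necessarily the exact pipeline used above:

```python
def false_alarms_per_hour(alarm_times_s, seizure_onsets_s,
                          horizon_s=1800, record_hours=24.0):
    """Event-level scoring sketch: an alarm is a true warning if it
    falls inside the pre-ictal horizon before some seizure onset;
    every other alarm is a false alarm.
    Returns (event sensitivity, false alarms per hour)."""
    predicted = set()
    false_alarms = 0
    for t in alarm_times_s:
        hits = [o for o in seizure_onsets_s if 0 <= o - t <= horizon_s]
        if hits:
            predicted.update(hits)  # these seizures were warned about
        else:
            false_alarms += 1
    sensitivity = len(predicted) / len(seizure_onsets_s)
    return sensitivity, false_alarms / record_hours
```

Reporting this pair (sensitivity at a fixed FA/h budget) is what distinguishes deployment-oriented event-level evaluation from the window-level metrics criticized above.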
Rahjouei, A.
Actigraphy is widely used for long-term sleep monitoring, but established sleep-wake scoring algorithms often require parameter tuning, which is commonly performed manually and can reduce reproducibility. In this study, we present a grid-search-based calibration framework for established actigraphy algorithms and evaluate whether it can serve as a practical alternative to manual tuning. The method was evaluated using two datasets: a multi-subject polysomnography-validated actigraphy dataset and a self-collected dual-device dataset. In the polysomnography-validated dataset, grid-search optimization produced performance patterns similar to manual parameter selection, while slightly improving detection of sleep onset and sleep offset and yielding modest gains in wake-sensitive metrics. In the dual-device dataset, consensus and majority voting were useful for reducing the influence of brief wake episodes occurring within the main sleep period, including micro-awakenings that can fragment sleep predictions across individual algorithms. Overall, these findings show that grid-search can replace manual parameter tuning with a more explicit and reproducible procedure while providing small improvements in sleep timing estimation and benefiting ensemble-based handling of within-sleep wakefulness.
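A grid search over a single scorer parameter, in the spirit of the calibration framework described above. The weighted-moving-average scorer and its kernel are illustrative stand-ins for established algorithms such as Cole-Kripke, and the agreement criterion is simple epoch-wise accuracy against PSG labels:

```python
def score_sleep(activity, scale, threshold=1.0):
    """Cole-Kripke-style scorer (illustrative weights): a weighted
    moving average of activity counts; below threshold -> sleep (1)."""
    weights = [0.04, 0.2, 1.0, 0.2, 0.04]  # hypothetical kernel
    half = len(weights) // 2
    scored = []
    for i in range(len(activity)):
        total = 0.0
        for k, w in enumerate(weights):
            j = i + k - half
            if 0 <= j < len(activity):
                total += w * activity[j]
        scored.append(1 if scale * total < threshold else 0)
    return scored

def grid_search_scale(activity, psg_labels, grid):
    """Pick the scale factor that maximizes epoch-wise agreement with
    polysomnography (ground-truth) sleep/wake labels."""
    def agreement(scale):
        pred = score_sleep(activity, scale)
        return sum(p == t for p, t in zip(pred, psg_labels)) / len(psg_labels)
    return max(grid, key=agreement)
```

The same loop extends to multi-parameter grids (kernel weights, thresholds, rescoring rules) by iterating over the Cartesian product of candidate values, which is what makes the procedure explicit and reproducible.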
Singh, A.; Infante, S.; Kim, S.; Kabir, A.
Pregnancy care often involves simultaneous obstetric and other medical conditions, but their co-occurrence patterns are rarely modeled explicitly in a systematic, network-based approach. In this work, we formulate obstetric and non-obstetric diagnosis co-occurrences as a link prediction problem on a diagnosis-level homogeneous graph constructed from pregnancy encounters. Diagnoses are represented as nodes connected by co-occurrence edges, with node features capturing graph structure and demographic statistics. We address this challenge by leveraging collected electronic health records data and study several standalone and hybrid graph neural network (GNN) architectures, including GCN, GAT, GraphSAGE, and three hybrid encoders that combine complementary aggregation mechanisms, namely GCN+GraphSAGE, GCN+GAT, and GAT+GraphSAGE. All models use consistent train-validation-test splits and are evaluated on 5-fold cross-validation sets. Among standalone models, GraphSAGE achieved the strongest performance, whereas the hybrid GraphSAGE-based models (GCN+GraphSAGE and GAT+GraphSAGE) were the best performers overall. The GCN+GraphSAGE hybrid, reaching an AUROC and AUPRC of approximately 0.90, consistently outperformed all other architectures. Further analysis of top-ranked predicted links revealed clinically plausible associations between pregnancy stage and risk-related diagnoses and common endocrine, metabolic, and hematological conditions. These findings indicate that graph-based link prediction may effectively prioritize obstetric and non-obstetric diagnosis pairs, providing a scalable framework for identifying clinically meaningful comorbidity patterns. They may further support hypothesis generation and downstream obstetric risk stratification efforts. Availability: All code, including data preparation scripts, training and validation recipes, and experimental configurations, is available at: https://github.com/kabir-ai2bio-lab/ob-nonob-diagnoses-cooccurrences.
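Link-prediction scoring on learned node embeddings can be sketched with an inner-product decoder and a pairwise AUROC, both standard choices; the GNN encoders themselves are not reproduced here, and the toy embeddings in the usage are hypothetical:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def auroc(pos_scores, neg_scores):
    """Probability that a positive (observed co-occurrence) edge
    outranks a negative (absent) edge; ties count half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(pos_scores) * len(neg_scores))

def link_auroc(embeddings, pos_edges, neg_edges):
    """Score a candidate diagnosis pair by the inner product of its
    two node embeddings (a common GNN link-prediction decoder), then
    rank held-out positive edges against sampled negative edges."""
    pos = [dot(embeddings[u], embeddings[v]) for u, v in pos_edges]
    neg = [dot(embeddings[u], embeddings[v]) for u, v in neg_edges]
    return auroc(pos, neg)
```

In practice the embeddings come from the trained encoder (e.g. GCN+GraphSAGE), and the same decoder scores are used both for AUROC/AUPRC evaluation and for ranking novel diagnosis pairs.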
Georgiou, G. P.; Paphiti, M.
Autism spectrum disorder (ASD) is a neurodevelopmental condition for which timely and accurate detection remains a major clinical priority. Early and reliable identification is important because it can facilitate access to assessment, diagnosis, and appropriate support; however, current diagnostic pathways still rely largely on behavioural evaluation and clinical judgement. In this context, machine-learning (ML) approaches have attracted growing interest because they can identify subtle and complex patterns in speech data that may not be easily captured through conventional methods. The current study capitalizes on this potential by developing and evaluating ML models for distinguishing autistic individuals from neurotypical individuals based on speech features. More specifically, acoustic features of vowels, including fundamental frequency (F0), first three formants (F1, F2, F3), duration, jitter, shimmer, harmonics-to-noise ratio (HNR), and intensity, were elicited from 18 autistic adults and 18 neurotypical adults through a controlled production task. Then, four supervised ML models were trained and evaluated on these features: LightGBM, Random Forest, Support Vector Machine, and XGBoost. All models demonstrated good classification performance, with the best-performing model achieving a strong discriminability of 89%. The explainability analysis identified F0 as the most influential predictor by a substantial margin, followed by intensity, F3, and F1, while duration, shimmer, HNR, jitter, and F2 contributed more modestly. These findings demonstrate that vowel acoustics contain clinically relevant information for distinguishing autistic from neurotypical adult speech and highlight the potential of interpretable, speech-based ML as a transparent and scalable aid for ASD screening and assessment.
Islam, N.; Luo, C.; Tong, J.; Polleya, D. A.; Jordan, C. T.; Haverkos, B.; Bair, S.; Kent, A.; Weller, G.
Cox proportional hazard regressions are frequently employed to develop prognostic models for time-to-event data, considering both patient-specific and disease-specific characteristics. In high-dimensional clinical modeling, these biological features can exhibit high collinearity due to inter-feature relationships, potentially causing instability and numerical issues during estimation without regularization. For rare diseases such as acute myeloid leukemia (AML), the sparsity and scarcity of data further complicate estimation. In such cases, data augmentation through multi-site collaboration can alleviate these problems. However, this often necessitates sharing individual patient data (IPD) across sites, which presents challenges due to regulatory barriers aimed at protecting patient privacy. To overcome these challenges, we propose a privacy-preserving algorithm that eliminates sharing IPD across sites and fits a federated penalized piecewise exponential model (FedPPEM) to estimate potential effects of clinical features using summary statistics. This algorithm yields results nearly identical to those from pooled IPD, including effect size and standard error estimates. We demonstrate the model's performance in quantifying effects of clinical features and genetic risk classification on overall survival using real-world data from ~1,200 newly diagnosed AML patients across 33 U.S. sites. Although applied in the AML context, this model is disease-agnostic and can be implemented in other diseases and clinical contexts.
Daya, N. R.; Wang, D.; Zhang, S.; Fang, M.; Wallace, A.; Zeger, S.; Selvin, E.
In this article, we present the cgmstats package for the analysis of continuous glucose monitoring (CGM) data. The use of wearable CGMs is growing rapidly. The latest generation of CGM systems do not require fingerstick calibration, are minimally invasive, and are frequently used in research studies. CGM sensors are typically worn for up to 2 weeks and record interstitial glucose measurements every minute to every 15 minutes, depending on the sensor used. CGM systems generate hundreds of measurements per day and thousands of measurements in one person over a single wear. There is a need for tools that allow researchers to efficiently organize and summarize the wealth of data on glucose patterns produced by CGM systems. The cgmstats package generates CGM summary measures for data from a variety of CGM systems and allows the user to flexibly define ranges and generate data visualizations. In this article, we provide an overview of the cgmstats package and examples of its use. The cgmstats package supports rigorous and reproducible analyses of CGM data.
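Typical CGM summary measures of the kind cgmstats computes can be sketched directly. The functions below are illustrative and are not the package's API; the glucose management indicator formula (GMI % = 3.31 + 0.02392 × mean glucose in mg/dL) is the published ADA/Bergenstal formula:

```python
def time_in_range(glucose_mgdl, low=70, high=180):
    """Percent of CGM readings inside a (user-definable) target
    range, one of the standard CGM summary measures."""
    in_range = sum(low <= g <= high for g in glucose_mgdl)
    return 100.0 * in_range / len(glucose_mgdl)

def mean_glucose(glucose_mgdl):
    return sum(glucose_mgdl) / len(glucose_mgdl)

def estimated_a1c(glucose_mgdl):
    """Glucose management indicator (GMI), an HbA1c estimate derived
    from mean sensor glucose: GMI = 3.31 + 0.02392 * mean."""
    return 3.31 + 0.02392 * mean_glucose(glucose_mgdl)
```

A package like cgmstats layers on top of such primitives the parsing of vendor export formats, per-day and per-wear aggregation, and visualization.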
Zhang, X.; Fang, Z.; Tang, K.; Chen, H.; Li, J.
Targeted drug therapies offer a promising approach for treating complex diseases, with combinational drug therapies often employed to enhance therapeutic efficacy. However, unintended drug-drug interactions (DDIs) may undermine treatment outcomes or cause adverse side effects. In this work, we propose a novel joint learning framework, SI-ADMM, for the simultaneous prediction of effective drug combinations and drug-drug interactions, based on coupled tensor-tensor factorization. Specifically, we model drug combination therapies and DDIs by representing drug-drug-disease associations and drug-drug interaction profiles as coupled three-way tensors. To address the challenges of data incompleteness and sparsity, the proposed model integrates auxiliary drug similarity information, such as chemical structure similarities, drug-specific side effects, drug target profiles, and drug inhibition data on cancer cell lines, within a multi-view learning framework. For optimization, we adopt a modified Alternating Direction Method of Multipliers (ADMM) algorithm that ensures convergence while enforcing non-negativity constraints. In addition to standard tensor completion tasks, we further evaluate the proposed method under a more realistic new-drug prediction setting, where all interactions involving a previously unseen drug are withheld. This scenario closely aligns with real-world applications, in which reliable predictions for emerging or under-studied compounds are essential. We evaluate the proposed method on a comprehensive dataset compiled from multiple sources, including DrugBank, CDCDB, SIDER, and PubChem. Our experiments show that SI-ADMM maintains robust performance and achieves the best results compared with other tensor factorization approaches, with or without auxiliary information, particularly in the new-drug prediction setting. The implementation of our method is publicly available at: https://github.com/Xiaoge-Zhang/SI-ADMM.
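The core coupling idea, two data views sharing one drug factor, can be illustrated in a much-simplified form. The sketch below uses coupled nonnegative *matrix* factorization with multiplicative updates rather than the paper's three-way tensors and modified ADMM; the data sizes, rank, and update rule are assumptions for the example, chosen only to show how a shared factor lets one view inform the other.

```python
import numpy as np

def coupled_nmf(M1, M2, rank, iters=300, eps=1e-9):
    """Coupled nonnegative factorization: M1 ~ W @ H1, M2 ~ W @ H2.

    W (drug factor) is shared across both views, so information in the
    drug-disease view M1 constrains the drug-interaction view M2 and
    vice versa. Lee-Seung-style multiplicative updates keep all
    factors nonnegative. Simplified illustration, not SI-ADMM.
    """
    rng = np.random.default_rng(0)
    W = rng.random((M1.shape[0], rank)) + eps
    H1 = rng.random((rank, M1.shape[1])) + eps
    H2 = rng.random((rank, M2.shape[1])) + eps
    for _ in range(iters):
        # shared factor sees gradients from BOTH views
        W *= (M1 @ H1.T + M2 @ H2.T) / (W @ (H1 @ H1.T + H2 @ H2.T) + eps)
        H1 *= (W.T @ M1) / (W.T @ W @ H1 + eps)
        H2 *= (W.T @ M2) / (W.T @ W @ H2 + eps)
    return W, H1, H2

# Synthetic rank-3 data built from a shared ground-truth drug factor
rng = np.random.default_rng(42)
W0 = rng.random((20, 3))
M1 = W0 @ rng.random((3, 8))    # drug x disease associations
M2 = W0 @ rng.random((3, 15))   # drug x interaction profiles
W, H1, H2 = coupled_nmf(M1, M2, rank=3)
err = np.linalg.norm(M1 - W @ H1) + np.linalg.norm(M2 - W @ H2)
```

In the full model, each matrix becomes a three-way tensor, the side-information views add similarity-regularization terms, and ADMM splitting handles the non-negativity constraints with convergence guarantees.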
Wang, G.; Yang, S.; Ding, J.-e.; Zhu, H.; Liu, F.
Electroencephalography (EEG) provides a non-invasive window into neural dynamics at high temporal resolution and plays a pivotal role in clinical neuroscience research. Despite this potential, prevailing computational approaches to EEG analysis remain largely confined to task-specific classification objectives or coarse-grained pattern recognition, offering limited support for clinically meaningful interpretation. To address these limitations, we introduce NeuroNarrator, the first generalist EEG-to-text foundation model designed to translate electrophysiological segments into precise clinical narratives. A cornerstone of this framework is the curation of NeuroCorpus-160K, the first harmonized large-scale resource pairing over 160,000 EEG segments with structured, clinically grounded natural-language descriptions. Our architecture first aligns temporal EEG waveforms with spatial topographic maps via a rigorous contrastive objective, establishing spectro-spatially grounded representations. Building on this grounding, we condition a Large Language Model through a state-space-inspired formulation that integrates historical temporal and spectral context to support coherent clinical narrative generation. This approach establishes a principled bridge between continuous signal dynamics and discrete clinical language, enabling interpretable narrative generation that facilitates expert interpretation and supports clinical reporting workflows. Extensive evaluations across diverse benchmarks and zero-shot transfer tasks highlight NeuroNarrator's capacity to integrate temporal, spectral, and spatial dynamics, positioning it as a foundational framework for time-frequency-aware, open-ended clinical interpretation of electrophysiological data.
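A common form of the contrastive objective used to align two modalities is a symmetric InfoNCE (CLIP-style) loss, where matching waveform/topography pairs sit on the diagonal of a similarity matrix. The abstract does not specify its exact loss; the numpy sketch below shows this standard formulation, with batch size, embedding dimension, and temperature as assumed values.

```python
import numpy as np

def info_nce(wave_emb, topo_emb, temperature=0.07):
    """Symmetric contrastive loss aligning waveform embeddings with
    topographic-map embeddings of the same EEG segments.

    Row i of each batch is the same segment in two views; matching
    pairs therefore lie on the diagonal of the similarity matrix.
    Illustrative of the general technique, not NeuroNarrator's code.
    """
    w = wave_emb / np.linalg.norm(wave_emb, axis=1, keepdims=True)
    t = topo_emb / np.linalg.norm(topo_emb, axis=1, keepdims=True)
    logits = w @ t.T / temperature        # batch x batch cosine sims
    idx = np.arange(len(logits))          # positives on the diagonal

    def ce(l):
        l = l - l.max(axis=1, keepdims=True)   # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[idx, idx].mean()

    # average of waveform->topo and topo->waveform directions
    return 0.5 * (ce(logits) + ce(logits.T))

# Perfectly aligned pairs score far lower than mismatched ones
rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 16))
loss_aligned = info_nce(emb, emb)
loss_shuffled = info_nce(emb, emb[::-1])
```

Minimizing this loss pulls each segment's waveform and topography embeddings together while pushing apart embeddings from different segments, yielding the spectro-spatially grounded representations the LLM is then conditioned on.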